A generic lexicon tool for word model definition in multimodal applications
Abstract
This paper describes a generic lexicon tool that uses lexical representations and finite state transducers, enhanced by arithmetic operations in DATR, to generate individual output formats from a general phonological feature-based representation. The tool was developed in connection with the lexicon component of BEETLE, a diagnostic evaluation toolkit for a linguistic word recognition system. The system uses this lexicon online to distinguish between actual and potential syllables, and the lexicon is used offline for evaluation with respect to a particular corpus. Rather than design a syllable lexicon that serves only these two tasks, it was decided to develop a more generic lexicon from which specific lexica can be generated on the fly. BEETLE is therefore only one application of the generic lexicon tool, which can also generate output formats for other speech technology and multimodal applications.
Similar resources
COMPUTATIONAL LEXICOGRAPHY AND LEXICOLOGY Computational Processing of Czech Derived Words
The system presented in this paper is concerned with the computational processing of selected types of Czech word-formation. The developed programming tool (word-formation module) aims at analysing and synthesising Czech derived words. Such a system is of particular value for automatic processing of Czech language where derivational morphology plays an important role in regular wor...
A Feature Geometry Based Lexicon Model For Speech Applications
This paper proposes a generic feature geometry based lexical representation of phonological description using inheritance networks which has applications in various areas of speech technology. The feature geometry based lexicon model provides the linguistic knowledge for a generic lexicon tool which allows application-specific lexica to be generated. The relevance of this approach to interactiv...
A Study and Comparison of the Development of the Content Aspect of Word Definition Skill in 7- to 12-Year-Old Persian-Speaking Students
Objective: Language has three components: content, form, and pragmatics. The content includes the semantic components. Semantic knowledge of word relationships requires awareness of the relationships between different words in the same field and in other fields. One of the main components of semantics is the mental lexicon that many of the semantic communications, including the organization and se...
Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach
This paper presents a pragmatic approach to Chinese word segmentation. It differs from most previous approaches mainly in three respects. First of all, while theoretical linguists have defined Chinese words with various linguistic criteria, Chinese words in this study are defined pragmatically as segmentation units whose definition depends on how they are used and processed in rea...
Learning words from sights and sounds: a computational model
This paper presents a model of word acquisition which learns from multimodal sensory input. Set in an information theoretic framework, the model acquires a lexicon by finding and statistically modeling consistent inter-modal structure. Learning is achieved from multimodal sensor data without any human annotation. An implementation of this model was able to acquire a primitive audio-visual lexicon...